05. Legacy Crawler
Step 3. Running the Legacy Crawler
Before we run the crawler, let's make sure it can write its results so that we can see it working! It'd be embarrassing to show this to your manager only to realize it's not fully functional!
First, find the "main" program file at src/main/java/com/udacity/webcrawler/main/WebCrawlerMain.java. You should find two TODOs there: one for writing the crawl output and one for writing the profile output.
Don't worry about the profile output yet; you'll get to that later. For now, complete the first TODO using the output path stored in the config field. You will have to use the CrawlResultWriter class that you just wrote: create an instance of CrawlResultWriter by passing in the CrawlResult (the code that creates the CrawlResult is already written).
Next, check the value of config.getResultPath(). If it's a non-empty string, create a Path using config.getResultPath() as the file name, then pass that Path to the CrawlResultWriter#write(Path) method. Alternatively, if the value of config.getResultPath() is empty, the results should be printed to standard output (also known as System.out).
Hint: There may be a standard Writer implementation in java.io (*cough* OutputStreamWriter *cough*) that converts System.out into a Writer, which can then be passed to CrawlResultWriter#write(Writer).
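To make that concrete, here is a minimal sketch of what the completed TODO might look like inside WebCrawlerMain. The variable names (result for the CrawlResult, config for the configuration field) and the exact CrawlResultWriter constructor and write signatures are assumptions based on the description above, so adapt them to your actual code:

// Sketch only. Assumes imports of java.io.OutputStreamWriter, java.io.Writer,
// and java.nio.file.Path, plus a CrawlResult named "result" created by the
// existing code and a configuration field named "config".
CrawlResultWriter resultWriter = new CrawlResultWriter(result);

String resultPath = config.getResultPath();
if (resultPath != null && !resultPath.isEmpty()) {
  // A non-empty path means the result should be written to that file.
  resultWriter.write(Path.of(resultPath));
} else {
  // An empty path means the result should go to standard output.
  // OutputStreamWriter adapts System.out (an OutputStream) into a Writer.
  Writer stdoutWriter = new OutputStreamWriter(System.out);
  resultWriter.write(stdoutWriter);
  stdoutWriter.flush(); // make sure buffered output actually reaches the terminal
}

Depending on how your CrawlResultWriter methods are declared, you may also need to handle or declare IOException around this code.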
Next, build the project (skipping tests, since they shouldn't all pass yet):
mvn package -Dmaven.test.skip=true
Finally, run the legacy crawler using the sample configuration file included with the project:
java -classpath target/udacity-webcrawler-1.0.jar \
com.udacity.webcrawler.main.WebCrawlerMain \
src/main/config/sample_config_sequential.json
Was the JSON result printed to the terminal?